spectral initialization
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Learning single index model with gradient descent: spectral initialization and precise asymptotics
Non-convex optimization plays a central role in many statistics and machine learning problems. Despite the landscape irregularities of general non-convex functions, recent work has shown that for many learning problems with random data and a large enough sample size, there exists a region around the true signal with a benign landscape. Motivated by this observation, a widely used strategy is a two-stage algorithm: first apply a spectral initialization to land in this region, then run gradient descent for further refinement. While this two-stage algorithm has been extensively analyzed for many non-convex problems, the precise distributional properties of both its transient and long-time behavior remain to be understood. In this work, we study this two-stage algorithm in the context of single index models under the proportional asymptotics regime. We derive a set of dynamical mean field equations that describe the precise behavior of the trajectory of spectrally initialized gradient descent in the large system limit. We further show that when the spectral initialization successfully lands in a region of benign landscape, the equation system is asymptotically time-translation invariant and exponentially convergent, and thus admits a set of long-time fixed points that give a mean field characterization of the limiting point of the gradient descent dynamics. As a proof of concept, we demonstrate our general theory on the example of regularized Wirtinger flow for phase retrieval.
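As a rough illustration of the two-stage pipeline described above, the sketch below runs spectral initialization followed by plain (unregularized) Wirtinger-flow gradient descent on a toy real-valued phase retrieval instance. The dimensions, step size, and iteration count are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 50                            # samples, dimension (toy sizes)
x_star = rng.standard_normal(d)
x_star /= np.linalg.norm(x_star)           # unit-norm true signal
A = rng.standard_normal((n, d))
y = (A @ x_star) ** 2                      # phaseless (squared) measurements

# Stage 1: spectral initialization -- leading eigenvector of the
# measurement-weighted covariance (1/n) sum_i y_i a_i a_i^T.
M = (A * y[:, None]).T @ A / n
x = np.linalg.eigh(M)[1][:, -1]

# Stage 2: gradient descent on the Wirtinger-flow least-squares loss
#   f(x) = (1/4n) sum_i ((a_i^T x)^2 - y_i)^2.
eta = 0.1
for _ in range(500):
    Ax = A @ x
    x -= eta * (A.T @ ((Ax ** 2 - y) * Ax)) / n

# The signal is recoverable only up to a global sign.
err = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
```

With this oversampling ratio (n/d = 40), the spectral initializer lands well inside the benign region and gradient descent converges to the signal up to sign.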
Model-free algorithms for fast node clustering in SBM type graphs and application to social role inference in animals
Cloez, Bertrand, Cotil, Adrien, Menassol, Jean-Baptiste, Verzelen, Nicolas
Graphs have become extremely useful for representing a wide variety of systems in different contexts: biological, social, informational, and more. A basic way to study them is to partition the vertices of a graph into clusters that are more densely connected internally; this is commonly called community detection or graph clustering; see for instance [20, 1]. Community detection and clustering are central problems in machine learning and data science. In particular, the stochastic block model (SBM) [34, 25] has been widely used as a canonical model for community detection and as a building block for clustering under additional structural assumptions. In its most general form, the SBM is a randomly weighted graph model in which each node has an unobserved label and the probability of observing a given edge between two nodes depends only on the labels of the nodes under consideration.
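To make the model concrete, here is a minimal sketch that samples a two-block SBM and recovers the hidden labels with a basic adjacency spectral heuristic. The community count, connection probabilities, and the spectral method itself are illustrative assumptions, not the model-free algorithms the paper develops.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 300, 2
labels = rng.integers(0, k, size=n)          # hidden node labels
P = np.array([[0.30, 0.05],
              [0.05, 0.30]])                 # within / between edge probabilities

# Sample a symmetric adjacency matrix: the probability of each edge
# depends only on the labels of its two endpoints.
probs = P[labels[:, None], labels[None, :]]
upper = np.triu(rng.random((n, n)) < probs, k=1)
A = (upper | upper.T).astype(float)

# Recover the two communities from the sign pattern of the second
# eigenvector of the adjacency matrix (a basic spectral heuristic).
eigvecs = np.linalg.eigh(A)[1]
guess = (eigvecs[:, -2] > 0).astype(int)

# Agreement with the truth, up to the global label permutation.
acc = max(np.mean(guess == labels), np.mean(guess != labels))
```

At these parameters the signal eigenvalue dominates the noise level, so the sign split of the second eigenvector recovers almost all labels.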
Understanding Incremental Learning with Closed-form Solution to Gradient Flow on Overparameterized Matrix Factorization
Many theoretical studies on neural networks attribute their excellent empirical performance to the implicit bias or regularization induced by first-order optimization algorithms when training networks under certain initialization assumptions. One example is the incremental learning phenomenon in gradient flow (GF) on an overparameterized matrix factorization problem with small initialization: GF learns a target matrix by sequentially learning its singular values in decreasing order of magnitude over time. In this paper, we develop a quantitative understanding of this incremental learning behavior for GF on the symmetric matrix factorization problem, using its closed-form solution obtained by solving a Riccati-like matrix differential equation. We show that incremental learning emerges from a time-scale separation among the dynamics corresponding to learning different components of the target matrix. Decreasing the initialization scale makes these time-scale separations more prominent, allowing one to find low-rank approximations of the target matrix. Lastly, we discuss possible avenues for extending this analysis to asymmetric matrix factorization problems.
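The incremental-learning effect is easy to reproduce numerically. The sketch below runs plain gradient descent (as a discrete proxy for gradient flow) on the symmetric factorization loss ||UU^T - M||_F^2 with a small random initialization; the target spectrum (eigenvalues 5 and 1), initialization scale, and step size are illustrative assumptions, not the paper's closed-form analysis.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 20
# Rank-2 symmetric PSD target with well-separated eigenvalues 5 and 1.
Q, _ = np.linalg.qr(rng.standard_normal((d, 2)))
v1, v2 = Q[:, 0], Q[:, 1]
M = 5.0 * np.outer(v1, v1) + 1.0 * np.outer(v2, v2)

alpha, eta = 1e-3, 0.01                   # small init scale, step size
U = alpha * rng.standard_normal((d, d))
snapshot = None
for t in range(2000):
    U -= eta * ((U @ U.T - M) @ U)        # GD on f(U) = ||UU^T - M||_F^2 / 4
    if t == 250:                          # mid-training snapshot
        snapshot = np.linalg.eigvalsh(U @ U.T)

final = np.linalg.eigvalsh(U @ U.T)
# Mid-training, the top eigenvalue (5) is already learned while the
# second (1) is still near zero; both are learned by the end.
```

The larger component grows at roughly five times the exponential rate of the smaller one, so the two learning phases separate cleanly in time, and shrinking `alpha` widens the gap further.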
To Reviewer 1 (R1)
We thank all three reviewers for their constructive comments. We address them below one by one. Q1: What makes it nontrivial to extend the regularity condition and proof technique in [11] to Riemannian optimization? The Grassmannian manifold is nonconvex, making the analysis more complex. We will incorporate these points into a revised version of the manuscript.
Frequency-Constrained Learning for Long-Term Forecasting
Kong, Menglin, Zheng, Vincent Zhihao, Sun, Lijun
However, modern deep forecasting models often fail to capture these recurring patterns due to spectral bias and a lack of frequency-aware inductive priors. Motivated by this gap, we propose a simple yet effective method that enhances long-term forecasting by explicitly modeling periodicity through spectral initialization and frequency-constrained optimization. Specifically, we extract dominant low-frequency components via Fast Fourier Transform (FFT)-guided coordinate descent, initialize sinusoidal embeddings with these components, and employ a two-speed learning schedule to preserve meaningful frequency structure during training. Our approach is model-agnostic and integrates seamlessly into existing Transformer-based architectures. Extensive experiments across diverse real-world benchmarks demonstrate consistent performance gains--particularly at long horizons--highlighting the benefits of injecting spectral priors into deep temporal models for robust and interpretable long-range forecasting. Moreover, on synthetic data, our method accurately recovers ground-truth frequencies, further validating its interpretability and effectiveness in capturing latent periodic patterns.
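A minimal sketch of the FFT-guided spectral initialization step on synthetic data follows. The two ground-truth frequencies, series length, and noise level are assumed for illustration; the paper's coordinate-descent selection and two-speed training schedule are not reproduced here, only the frequency extraction and sinusoidal embedding initialization.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 512
t = np.arange(T)
# Toy series with two latent periodicities (bins 5 and 23) plus noise.
y = (np.sin(2 * np.pi * 5 * t / T)
     + 0.5 * np.sin(2 * np.pi * 23 * t / T)
     + 0.1 * rng.standard_normal(T))

# FFT-guided selection of the dominant frequency components.
spec = np.abs(np.fft.rfft(y))
spec[0] = 0.0                            # drop the DC term
k = 2
top_bins = np.sort(np.argsort(spec)[-k:])  # k strongest frequency bins
freqs = top_bins / T                       # cycles per step

# Spectral initialization of sinusoidal embeddings:
# one (sin, cos) pair per selected frequency.
emb = np.stack([np.sin(2 * np.pi * f * t) for f in freqs]
               + [np.cos(2 * np.pi * f * t) for f in freqs], axis=1)
```

In a full model, these embeddings would feed the forecaster, with the frequency parameters updated on a slower schedule than the rest of the network so the recovered spectral structure is preserved during training.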
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
A Smoothing Newton Method for Rank-one Matrix Recovery
We consider the phase retrieval problem, which involves recovering a rank-one positive semidefinite matrix from rank-one measurements. A recently proposed algorithm based on Bures-Wasserstein gradient descent (BWGD) exhibits superlinear convergence, but it is unstable, and existing theory can only prove local linear convergence for higher-rank matrix recovery. We resolve this gap by revealing that BWGD implements Newton's method on a nonsmooth and nonconvex objective. We develop a smoothing framework that regularizes the objective, enabling a stable method with rigorous superlinear convergence guarantees. Experiments on synthetic data demonstrate this superior stability while maintaining fast convergence.

Phase retrieval, the problem of recovering a real or complex signal from magnitude-only measurements, is a fundamental problem in signal processing. Its applications range from X-ray crystallography to astronomical imaging, where measurement systems capture a form of intensity [Harrison, 1993, Fienup, 1982, Fienup and Dainty, 1987, Miao et al., 1998]. The seemingly simple constraint of measuring magnitudes transforms what would be a linear problem into a challenging nonlinear and nonconvex optimization problem. More critically, the direct optimization formulation using least squares yields a nonconvex objective function, making it difficult to solve effectively. The phase retrieval problem is a specific instance of a broader class of low-rank matrix sensing problems that arise throughout signal processing [Recht et al., 2010].
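The "rank-one measurements of a rank-one matrix" viewpoint can be checked directly: squared-magnitude measurements of a signal x are linear measurements of the lifted matrix X = xx^T against the sensing matrices a_i a_i^T. A small numerical check (sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 8, 40
x = rng.standard_normal(d)
A = rng.standard_normal((n, d))

# Magnitude-only measurements y_i = (a_i^T x)^2 ...
y = (A @ x) ** 2

# ... are linear in the lifted rank-one PSD matrix X = x x^T:
#   y_i = a_i^T X a_i = <a_i a_i^T, X>.
X = np.outer(x, x)
y_lifted = np.einsum('ni,ij,nj->n', A, X, A)

match = np.allclose(y, y_lifted)
```

This lifting is what turns the nonconvex signal-recovery problem into a (rank-constrained) linear matrix sensing problem.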